=============== <Original Dataset> ===============

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20640 entries, 0 to 20639
Data columns (total 10 columns):

| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | longitude | 20640 non-null | float64 |
| 1 | latitude | 20640 non-null | float64 |
| 2 | housing_median_age | 20640 non-null | float64 |
| 3 | total_rooms | 20640 non-null | float64 |
| 4 | total_bedrooms | 20433 non-null | float64 |
| 5 | population | 20640 non-null | float64 |
| 6 | households | 20640 non-null | float64 |
| 7 | median_income | 20640 non-null | float64 |
| 8 | median_house_value | 20640 non-null | float64 |
| 9 | ocean_proximity | 20640 non-null | object |

dtypes: float64(9), object(1)
memory usage: 1.6+ MB

| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -122.23 | 37.88 | 41.0 | 880.0 | 129.0 | 322.0 | 126.0 | 8.3252 | 452600.0 | NEAR BAY |
| 1 | -122.22 | 37.86 | 21.0 | 7099.0 | 1106.0 | 2401.0 | 1138.0 | 8.3014 | 358500.0 | NEAR BAY |
| 2 | -122.24 | 37.85 | 52.0 | 1467.0 | 190.0 | 496.0 | 177.0 | 7.2574 | 352100.0 | NEAR BAY |
| 3 | -122.25 | 37.85 | 52.0 | 1274.0 | 235.0 | 558.0 | 219.0 | 5.6431 | 341300.0 | NEAR BAY |
| 4 | -122.25 | 37.85 | 52.0 | 1627.0 | 280.0 | 565.0 | 259.0 | 3.8462 | 342200.0 | NEAR BAY |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 20635 | -121.09 | 39.48 | 25.0 | 1665.0 | 374.0 | 845.0 | 330.0 | 1.5603 | 78100.0 | INLAND |
| 20636 | -121.21 | 39.49 | 18.0 | 697.0 | 150.0 | 356.0 | 114.0 | 2.5568 | 77100.0 | INLAND |
| 20637 | -121.22 | 39.43 | 17.0 | 2254.0 | 485.0 | 1007.0 | 433.0 | 1.7000 | 92300.0 | INLAND |
| 20638 | -121.32 | 39.43 | 18.0 | 1860.0 | 409.0 | 741.0 | 349.0 | 1.8672 | 84700.0 | INLAND |
| 20639 | -121.24 | 39.37 | 16.0 | 2785.0 | 616.0 | 1387.0 | 530.0 | 2.3886 | 89400.0 | INLAND |
20640 rows × 10 columns
=============== <Modified Dataset> ===============

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20433 entries, 0 to 20432
Data columns (total 9 columns):

| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | longitude | 20433 non-null | float64 |
| 1 | latitude | 20433 non-null | float64 |
| 2 | housing_median_age | 20433 non-null | float64 |
| 3 | total_rooms | 20433 non-null | float64 |
| 4 | total_bedrooms | 20433 non-null | float64 |
| 5 | population | 20433 non-null | float64 |
| 6 | households | 20433 non-null | float64 |
| 7 | median_income | 20433 non-null | float64 |
| 8 | ocean_proximity | 20433 non-null | object |

dtypes: float64(8), object(1)
memory usage: 1.4+ MB

| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | ocean_proximity |
|---|---|---|---|---|---|---|---|---|---|
| 0 | -122.23 | 37.88 | 41.0 | 880.0 | 129.0 | 322.0 | 126.0 | 8.3252 | NEAR BAY |
| 1 | -122.22 | 37.86 | 21.0 | 7099.0 | 1106.0 | 2401.0 | 1138.0 | 8.3014 | NEAR BAY |
| 2 | -122.24 | 37.85 | 52.0 | 1467.0 | 190.0 | 496.0 | 177.0 | 7.2574 | NEAR BAY |
| 3 | -122.25 | 37.85 | 52.0 | 1274.0 | 235.0 | 558.0 | 219.0 | 5.6431 | NEAR BAY |
| 4 | -122.25 | 37.85 | 52.0 | 1627.0 | 280.0 | 565.0 | 259.0 | 3.8462 | NEAR BAY |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 20428 | -121.09 | 39.48 | 25.0 | 1665.0 | 374.0 | 845.0 | 330.0 | 1.5603 | INLAND |
| 20429 | -121.21 | 39.49 | 18.0 | 697.0 | 150.0 | 356.0 | 114.0 | 2.5568 | INLAND |
| 20430 | -121.22 | 39.43 | 17.0 | 2254.0 | 485.0 | 1007.0 | 433.0 | 1.7000 | INLAND |
| 20431 | -121.32 | 39.43 | 18.0 | 1860.0 | 409.0 | 741.0 | 349.0 | 1.8672 | INLAND |
| 20432 | -121.24 | 39.37 | 16.0 | 2785.0 | 616.0 | 1387.0 | 530.0 | 2.3886 | INLAND |
20433 rows × 9 columns
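
The difference between the two dumps above (207 fewer rows, a renumbered index, and no `median_house_value` column) is consistent with dropping the rows that have a missing `total_bedrooms` value and setting the target aside before clustering. A minimal sketch of such a step, assuming pandas; the function name `prepare` and the toy frame are hypothetical and are not taken from the original notebook:

```python
import numpy as np
import pandas as pd


def prepare(df: pd.DataFrame, target: str = "median_house_value"):
    """Drop rows with missing values, then split off the target column."""
    cleaned = df.dropna().reset_index(drop=True)  # renumber rows 0..n-1
    labels = cleaned[target]
    features = cleaned.drop(columns=[target])
    return features, labels


# Toy stand-in for the housing table (illustrative values, one NaN injected).
toy = pd.DataFrame(
    {
        "total_rooms": [880.0, 7099.0, 1467.0],
        "total_bedrooms": [129.0, np.nan, 190.0],
        "median_house_value": [452600.0, 358500.0, 352100.0],
    }
)
features, labels = prepare(toy)
print(features.shape, labels.shape)
```

Keeping the target in a separate Series makes it available later for comparing cluster assignments against house values without letting it influence the clustering itself.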
=============== AutoML Start ===============

=============== Model : DBSCAN ===============

min_samples = 100 / eps = 0.3 / metric = euclidean
Done.
========== Compare with original labels ==========

median_house_value by predicted label (-1 is the DBSCAN noise label):

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 20433 | 500001.0 | 179800.0 | 14999.0 | 207308.787696 |

min_samples = 100 / eps = 0.3 / metric = manhattan
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 20433 | 500001.0 | 179800.0 | 14999.0 | 207308.787696 |

min_samples = 200 / eps = 0.3 / metric = euclidean
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 20433 | 500001.0 | 179800.0 | 14999.0 | 207308.787696 |

min_samples = 200 / eps = 0.3 / metric = manhattan
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 20433 | 500001.0 | 179800.0 | 14999.0 | 207308.787696 |

min_samples = 100 / eps = 1 / metric = euclidean
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 3293 | 500001.0 | 187500.0 | 14999.0 | 216074.700881 |
| 0.0 | 2346 | 500001.0 | 227800.0 | 22500.0 | 246892.227195 |
| 1.0 | 13618 | 500001.0 | 174300.0 | 14999.0 | 200529.627992 |
| 2.0 | 1176 | 500001.0 | 160150.0 | 40000.0 | 182300.025510 |

min_samples = 100 / eps = 1 / metric = manhattan
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 17576 | 500001.0 | 175750.0 | 14999.0 | 204108.153334 |
| 0.0 | 222 | 500001.0 | 263200.0 | 55000.0 | 287392.013514 |
| 1.0 | 2635 | 500001.0 | 190000.0 | 38800.0 | 221910.637192 |

min_samples = 200 / eps = 1 / metric = euclidean
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 5175 | 500001.0 | 180700.0 | 14999.0 | 210905.057391 |
| 0.0 | 2029 | 500001.0 | 225000.0 | 22500.0 | 244709.878265 |
| 1.0 | 12437 | 500001.0 | 175000.0 | 14999.0 | 201351.750905 |
| 2.0 | 792 | 500001.0 | 160250.0 | 40000.0 | 181538.785354 |

min_samples = 200 / eps = 1 / metric = manhattan
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 19257 | 500001.0 | 178100.0 | 14999.0 | 206309.705977 |
| 0.0 | 1176 | 500001.0 | 194900.0 | 47600.0 | 223668.750850 |

min_samples = 100 / eps = 1.5 / metric = euclidean
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 800 | 500001.0 | 190600.0 | 22500.0 | 220657.056250 |
| 0.0 | 19633 | 500001.0 | 179500.0 | 14999.0 | 206764.876178 |

min_samples = 100 / eps = 1.5 / metric = manhattan
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 10652 | 500001.0 | 173150.0 | 14999.0 | 201712.594255 |
| 0.0 | 1594 | 500001.0 | 218400.0 | 22500.0 | 239031.873275 |
| 1.0 | 1733 | 500001.0 | 111800.0 | 27500.0 | 140027.188113 |
| 2.0 | 6095 | 500001.0 | 196200.0 | 17500.0 | 229736.254799 |
| 3.0 | 359 | 500001.0 | 150000.0 | 40000.0 | 176522.309192 |

min_samples = 200 / eps = 1.5 / metric = euclidean
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 1182 | 500001.0 | 193800.0 | 22500.0 | 219859.526227 |
| 0.0 | 19251 | 500001.0 | 179200.0 | 14999.0 | 206538.179783 |

min_samples = 200 / eps = 1.5 / metric = manhattan
Done.
========== Compare with original labels ==========

median_house_value by predicted label:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| -1.0 | 14431 | 500001.0 | 166900.0 | 14999.0 | 197115.562955 |
| 0.0 | 1093 | 500001.0 | 211600.0 | 22500.0 | 236359.169259 |
| 1.0 | 4909 | 500001.0 | 196100.0 | 38800.0 | 230805.703402 |
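
The runs above sweep `eps`, `min_samples`, and `metric` for DBSCAN, then summarise the held-out `median_house_value` within each predicted cluster. The following is a rough sketch of how such a sweep could be written with scikit-learn and pandas. The helper names `compare_with_labels` and `dbscan_grid` are hypothetical, the data is synthetic rather than the housing table, and any scaling or encoding the original pipeline applied is not visible in the log and is not reproduced here:

```python
from itertools import product

import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN


def compare_with_labels(pred, target: pd.Series) -> pd.DataFrame:
    """Summarise the held-out target within each predicted cluster."""
    frame = pd.DataFrame({"predict": pred, "target": target.to_numpy()})
    return frame.groupby("predict")["target"].agg(
        ["count", "max", "median", "min", "mean"]
    )


def dbscan_grid(X, target, eps_values, min_samples_values, metrics):
    """Run DBSCAN for every parameter combination and collect summaries."""
    results = {}
    for eps, min_samples, metric in product(eps_values, min_samples_values, metrics):
        model = DBSCAN(eps=eps, min_samples=min_samples, metric=metric)
        pred = model.fit_predict(X)  # label -1 marks noise points
        results[(min_samples, eps, metric)] = compare_with_labels(pred, target)
    return results


# Synthetic stand-in data: two Gaussian blobs (not the housing data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (300, 2)), rng.normal(3.0, 0.3, (300, 2))])
target = pd.Series(rng.uniform(15_000, 500_000, len(X)))

results = dbscan_grid(
    X,
    target,
    eps_values=[0.3, 1, 1.5],
    min_samples_values=[100, 200],
    metrics=["euclidean", "manhattan"],
)
for params, summary in results.items():
    print("min_samples = %s / eps = %s / metric = %s" % params)
    print(summary)
```

A run in which every point lands in label -1, as in the `eps = 0.3` results above, means no point had `min_samples` neighbours within `eps`, so DBSCAN treated the whole dataset as noise.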